Spectral density of the non-backtracking operator
Authors
Abstract
The non-backtracking operator was recently shown to provide a significant improvement when used for spectral clustering of sparse networks. In this paper we analyze its spectral density on large random sparse graphs using a mapping to the correlation functions of a certain interacting quantum disordered system on the graph. On sparse, tree-like graphs, this can be solved efficiently by the cavity method and a belief propagation algorithm. We show that there exists a paramagnetic phase, leading to zero spectral density, that is stable outside a circle of radius √ρ, where ρ is the leading eigenvalue of the non-backtracking operator. We observe a second-order phase transition at the edge of this circle, between a zero and a non-zero spectral density. The fact that this phase transition is absent in the spectral density of other matrices commonly used for spectral clustering provides a physical justification of the performance of the non-backtracking operator in spectral clustering.

Introduction. Clustering and community detection are central tasks in the study of social, biological, and technological networks. Sparse networks, where the average degree is a constant independent of the size of the network, are arguably the most relevant for applications, and at the same time the most challenging for clustering. Spectral methods are among the most widely used for this task. They are conceptually simple, based on the computation of the principal eigenvalues and eigenvectors of an operator associated with the network [1]. Most commonly this operator is the adjacency matrix, the Laplacian (symmetrized and/or normalized), the random walk matrix, or the modularity matrix. The spectrum of these matrices generically decomposes into a bulk of non-informative eigenvalues and some informative eigenvalues separated from the bulk by a gap. The eigenvectors corresponding to the informative eigenvalues are correlated with the cluster structure.
However, on sparse networks, spectral clustering based on these commonly used matrices does not perform as well as, for instance, methods based on Bayesian inference, which can perform well even when the tails of the bulk of the spectrum flood the informative eigenvalue [2]. Recently, the authors of [3] proposed the so-called non-backtracking operator for spectral clustering and conjectured that this method is optimal: it is able to find clusters in large random clustered networks (in the stochastic block model) as long as it is information-theoretically possible. The non-backtracking matrix B, associated with an undirected graph, encodes adjacency between directed edges. Its element B_{i→j,k→l} is one if the edge i → j flows into the edge k → l, i.e. j = k and i ≠ l, and zero otherwise. The authors of [3] give theoretical and numerical evidence that, apart from the informative eigenvalues, the spectrum of this matrix is confined to the circle whose radius is the square root of the average excess degree of the network, and that it does not present the so-called Lifshitz tails [2] that spoil the performance of spectral clustering for the other matrices mentioned above. In order to better understand the performance of spectral clustering, it is crucial to understand in detail the spectral properties of the associated operators on random graphs. Analytical results for spectral densities of sparse random graphs are largely based on the replica and cavity methods and were mostly developed and studied for symmetric random matrices [4–8]. The result most rele...
arXiv:1404.7787v1 [cond-mat.dis-nn] 30 Apr 2014
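The definition of B given above can be illustrated with a minimal sketch. The function name `non_backtracking_matrix`, the edge-list input format, and the choice of the complete graph on four nodes as a test case are assumptions made for illustration only; they do not come from the paper, and the dense double loop is meant for small graphs.

```python
import numpy as np

def non_backtracking_matrix(edges):
    """Build B for an undirected simple graph given as a list of (i, j) pairs.

    Hypothetical helper: B[a, b] = 1 when directed edge a = (i -> j) flows
    into directed edge b = (k -> l), i.e. j == k and i != l.
    """
    # Each undirected edge contributes two directed edges.
    directed = [(i, j) for i, j in edges] + [(j, i) for i, j in edges]
    size = len(directed)
    B = np.zeros((size, size), dtype=int)
    for a, (i, j) in enumerate(directed):
        for b, (k, l) in enumerate(directed):
            if j == k and i != l:
                B[a, b] = 1
    return B, directed

# Illustrative input: the complete graph on 4 nodes (every node has degree 3).
edges = [(i, j) for i in range(4) for j in range(i + 1, 4)]
B, directed = non_backtracking_matrix(edges)

# B is not symmetric, so its eigenvalues are complex in general.
eigenvalues = np.linalg.eigvals(B)
rho = np.abs(eigenvalues).max()   # modulus of the leading eigenvalue
bulk_radius = np.sqrt(rho)        # radius of the circle discussed in the text
```

On this 3-regular example every row of B sums to 2, since each directed edge can continue along two edges without backtracking, and the leading eigenvalue ρ equals 2; the remaining eigenvalues have modulus at most √ρ. A regular graph is a degenerate case chosen only because its spectrum is easy to check by hand, and it says nothing about the random sparse graphs analyzed in the paper.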
Related resources
Community Detection with the Non-Backtracking Operator
Community detection consists in identification of groups of similar items within a population. In the context of online social networks, it is a useful primitive for recommending either contacts or news items to users. We will consider a particular generative probabilistic model for the observations, namely the so-called stochastic block model and prove that the non-backtracking operator provid...
Asymptotic distribution of eigenvalues of the elliptic operator system
The theory of spectral properties of non-self-adjoint differential operators on Sobolev spaces is an important field in mathematics, and different techniques are used to study them. In this paper, two types of non-self-adjoint differential operators on Sobolev spaces are considered and their spectral properties are investigated with two different, new techniques.
Finding communities in sparse networks
Spectral algorithms based on matrix representations of networks are often used to detect communities, but classic spectral methods based on the adjacency matrix and its variants fail in sparse networks. New spectral methods based on non-backtracking random walks have recently been introduced that successfully detect communities in many sparse networks. However, the spectrum of non-backtracking ...
Community Detection with the z-Laplacian
Community detection is a fundamental problem in network science, with broad applications across the biological and social arenas. A common approach is to leverage the spectral properties of an operator related to the network (most commonly the adjacency matrix or graph Laplacian), though there are regimes where these techniques are known to fail on sparse networks despite the existence of theor...
Spectral Clustering of graphs with the Bethe Hessian
Spectral clustering is a standard approach to label nodes on a graph by studying the (largest or lowest) eigenvalues of a symmetric real matrix such as e.g. the adjacency or the Laplacian. Recently, it has been argued that using instead a more complicated, non-symmetric and higher dimensional operator, related to the non-backtracking walk on the graph, leads to improved performance in detecting...
Journal title: CoRR
Volume: abs/1404.7787
Pages: -
Publication date: 2014